Abstract

Links between power law probability distributions and marginal distributions
of uniform laws on p-spheres in $\mathbb{R}^n$ show that a mathematical derivation of
the Boltzmann-Gibbs distribution necessarily passes through power law ones.
Results are also given that link parameters $p$ and $n$ to the value of the
non-extensivity parameter $q$ that characterizes these power laws in the context of
non-extensive statistics. PACS: 05.30.-d, 05.30.Jp
1 Introduction

The probability distribution (PD) deduced by Gibbs for the canonical ensemble
[1, 2], usually referred to as the Boltzmann-Gibbs (BG) equilibrium distribution

$$ P_{BG}(i) = \frac{\exp(-\beta E_i)}{Z_{BG}}, \tag{1} $$

with $E_i$ the energy of the microstate labelled by $i$, $\beta = 1/k_B T$ the inverse
temperature ($T$ being the temperature), $k_B$ Boltzmann's constant, and $Z_{BG}$ the
partition function, can fairly be regarded as statistical mechanics' most renowned
PD. In the last decade this PD has found a counterpart, in the guise of power-law
distributions, with reference to the so-called nonextensive thermostatistics
(NEXT). NEXT, or Tsallis' thermostatistics, is currently a very active field,
perhaps a new paradigm for statistical mechanics, with applications to several
scientific disciplines [?, 3, ?]. Power-law distributions are certainly ubiquitous
in physics (critical phenomena are just a conspicuous example [7]). Now, as is
well known, both the BG and power-law distributions arise quite naturally in
maximizing Shannon's (resp., Tsallis') information measure, the so-called MaxEnt
approach, which is one of the most powerful statistics-theoretical techniques
devised in the last 60 years.

Our goal here is to show that the above-mentioned probability distributions can
also be derived via purely geometric arguments, which should be of interest
to the large community of MaxEnt practitioners. To motivate our approach, we
first discuss the 2-sphere.



2 Physical motivation of the present work 
2.1 The 2-sphere 

Let us consider a dilute gas of $N$ hard spheres in a box with hard walls and
give these spheres some arbitrary initial distribution of momenta and positions.
Classically, after a few mean free times have passed, we expect that the
distribution of momenta $\vec{p}_i$ will be given by the Maxwell-Boltzmann (MB) formula,

$$ f_{MB}(\vec{P}) \propto \exp\left(-\|\vec{P}\|^2 / 2 m k_B T\right), \tag{2} $$

where the temperature $T$ is given in terms of the conserved total energy $U$ by
the ideal-gas relation $U = 3 N k_B T/2$, with $k_B$ the Boltzmann constant.
This is so because the Hamiltonian is simply

$$ H = \frac{1}{2m} \sum_{i=1}^{N} \vec{p}_i^{\,2} = \frac{\|\vec{P}\|^2}{2m}, \tag{3} $$

where $\vec{P}$ is a vector with $3N$ components and

$$ \|\vec{P}\|^2 = \sum_{i=1}^{N} \vec{p}_i^{\,2}. \tag{4} $$

Since the Hamiltonian $H$ takes on the constant value $U$, the allowed values of
$\vec{P}$ form a sphere of radius $(2mU)^{1/2}$, called the 2-sphere.

Suppose we now choose $\vec{P}$ at random on the 2-sphere. For this to be a meaningful
statement, we need a measure that tells us which sets of $\vec{P}$'s are equally
likely a priori. The obvious choice is to assign equal a priori probabilities to equal
areas on the 2-sphere. Why should this be the choice? Because, according to
Sinai [5], the hard-spheres gas is a chaotic system [?].

Thus, if we choose $\vec{P}$ at random with respect to this uniform measure, the
probability that our choice makes an angle between $\nu$ and $\nu + d\nu$ with respect
to any particular axis is simply [9]

$$ f(\nu)\,d\nu \propto (\sin\nu)^{3N-2}\,d\nu \propto (\sin\nu)^{3N-3}\,d\cos\nu \propto \left[(1-\cos^2\nu)^{1/2}\right]^{3N-3} d(\cos\nu). \tag{5} $$
If we now identify $(2mU)^{1/2}\cos\nu$ as, say, the value of $p_{1z}$ (the $z$ component of
the first particle's momentum), we find, with $U = 3 N k_B T/2$,

$$ f(p_{1z})\,dp_{1z} \propto \left[1 - p_{1z}^2/2mU\right]^{(3N-3)/2} dp_{1z}, \tag{6} $$

which is a power-law distribution [?]. In the large-$N$ limit this probability
becomes

$$ f(p_{1z})\,dp_{1z} \approx \exp\left(-p_{1z}^2/2 m k_B T\right) dp_{1z}. \tag{7} $$

One thereby recovers the MB distribution for $p_{1z}$, passing through a power-law
one. This result is reminiscent of an entirely similar one advanced years ago by
Plastino and Plastino, but from a very different viewpoint that uses the notion
of canonical ensemble [11].

Now consider the probability distribution for $p_{1y}$ when $p_{1z}$ is fixed. It is given
by (6), with the $3N$ in the exponent replaced by $3N - 1$ (since
there is one less coordinate when $p_{1z}$ is fixed) and $2mU$ replaced by $2mU - p_{1z}^2$.
In the large-$N$ limit we can, of course, neglect $p_{1z}^2$ compared to $2mU$, so that we
find the MB distribution for $p_{1y}$. In similar fashion one obtains, passing first
through a Tsallis distribution, the MB distribution for any $k$ components of $\vec{P}$,
as long as $k \ll N$.

Below we generalize these intuitive notions to the case where, in
equation (4), the power to which the summands are raised is any integer $p$.
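
As a quick numerical illustration of the passage from (6) to (7), the following minimal Python sketch compares the two unnormalized densities in the scaled variable $x = p_{1z}/\sqrt{m k_B T}$, for which $p_{1z}^2/2mU = x^2/3N$ when $U = 3 N k_B T/2$. The function names, the grid and the chosen values of $N$ are illustrative assumptions and not part of the original argument; the exponent $(3N-3)/2$ is the one implied by (5).

```python
import math

def power_law(x, n_particles):
    """Unnormalized density of Eq. (6) in the scaled variable x = p_1z / sqrt(m k_B T)."""
    # With U = 3 N k_B T / 2, p_1z^2 / (2 m U) equals x^2 / (3 N).
    base = 1.0 - x * x / (3.0 * n_particles)
    if base <= 0.0:
        return 0.0  # outside the sphere of radius sqrt(2 m U)
    return base ** ((3.0 * n_particles - 3.0) / 2.0)

def gaussian(x):
    """Unnormalized Maxwell-Boltzmann density of Eq. (7) in the same variable."""
    return math.exp(-x * x / 2.0)

def max_gap(n_particles, x_max=3.0, steps=300):
    """Largest absolute gap between the two densities on a grid over [0, x_max]."""
    grid = (x_max * i / steps for i in range(steps + 1))
    return max(abs(power_law(x, n_particles) - gaussian(x)) for x in grid)

# The gap shrinks as the number of particles grows (illustrative values of N).
for n_particles in (2, 10, 100, 1000):
    print(n_particles, max_gap(n_particles))
```

In this sketch the gap decreases roughly like $1/N$, which is the sense in which the power law (6) approaches the Gaussian (7).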

2.2 Revisiting the equipartition theorem 

In classical statistical mechanics there exists a useful general result concerning
the energy $E$ of a system expressed as a function of $N$ generalized coordinates
$q_i$ and momenta $p_i$. The result holds in the following (frequent) case:

1. the energy splits additively into the form

$$ E = \epsilon_i(p_i) + E'(q_1, \ldots, q_N, p_1, \ldots, p_{i-1}, p_{i+1}, \ldots, p_N), $$

where $\epsilon_i(p_i)$ involves only the degree of freedom $i$ (the variable $p_i$) and
the remaining part $E'$ does not depend on $p_i$.

2. the function $\epsilon_i(p_i)$ is quadratic in $p_i$.

Thus, $\langle \epsilon_i \rangle = k_B T/2$. Any independent quadratic term in the Hamiltonian
contributes this amount to the mean energy. This is the equipartition theorem [2].
Notice the similarity with the considerations of Section 2.1. Some light is thus
shed on the meaning of equipartition. The textbook demonstration assumes [2]
the thermal-equilibrium Boltzmann-Gibbs probability distribution

$$ f = \frac{1}{Z} e^{-\beta E}, \tag{8} $$

where $\beta = 1/k_B T$ is the (Shannon-Boltzmann-Gibbs) Lagrange multiplier
associated with the mean-energy constraint $\langle E \rangle = \int d\tau\, f E$, and $d\tau$ is the
phase-space volume element. However, it has been shown in [?] that the equipartition
theorem can be generalized i) to non-extensive statistics and ii) to cases in
which the Hamiltonian is a homogeneous function of degree $p$. This last fact
motivates the considerations that follow.



3 Geometric derivation of MaxEnt PDs 



3.1 Uniform distribution on the p-sphere and its marginals

We say that a random vector $X$ is orthogonally invariant if, for any deterministic
orthogonal matrix $A$, the random vector $AX$ is distributed as $X$. This is equivalent
to the fact that the probability distribution of $X$ depends on $X$ only through
its 2-norm. A typical physical example is that of rotations in $\mathbb{R}^3$, which are
represented by orthogonal matrices. Obviously, the physical meaning attached to
the distribution of $X$ will not change if we rotate the coordinate system. An
extension of this definition is as follows: a vector $X$ is p-spherically invariant
if its probability distribution depends only on the p-norm of $X$. Orthogonal
invariance thus corresponds to the case $p = 2$.

A uniform distribution on the p-sphere in $\mathbb{R}^n$ can be obtained by normalizing
a vector distributed according to any p-spherically invariant distribution, as
follows.

Theorem 1 An $n$-variate random vector $U$ is uniformly distributed on the
p-sphere if it can be written as

$$ U = \frac{X}{\|X\|_p}, $$

where $X$ is p-spherically invariant [13].

Note that the vector $U$ has unit p-norm:

$$ \|U\|_p = \left( \sum_{i=1}^{n} |U_i|^p \right)^{1/p} = 1, $$

which is to be regarded as a constraint. The marginal distributions of a uniform
distribution on the p-sphere in $\mathbb{R}^n$ can be easily computed as follows:

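Theorem 2 If $U$ is uniformly distributed on the p-sphere in $\mathbb{R}^n$, then the marginal
density of $V = [U_1, \ldots, U_k]^T$ is

$$ f(u_1, \ldots, u_k) = \frac{p^k\, \Gamma\left(\frac{n}{p}\right)}{2^k\, \Gamma^k\left(\frac{1}{p}\right) \Gamma\left(\frac{n-k}{p}\right)} \left( 1 - \sum_{i=1}^{k} |u_i|^p \right)^{\frac{n-k}{p} - 1}, \tag{9} $$

with $1 \le k \le n - 1$.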
Proof. The proof can be found in [?]. The first step consists in proving the
result for $k = n - 1$, using the change of variable

$$ Y_i = \frac{X_i}{\|X\|_p}, \quad 1 \le i \le n - 1, \qquad Y_n = \|X\|_p, \tag{10} $$

the Jacobian of which is

$$ J = r^{n-1} \left( 1 - \sum_{i=1}^{n-1} |y_i|^p \right)^{\frac{1-p}{p}}, \quad r = y_n. \tag{11} $$

The second step consists in a proof by backward induction on the dimension $k$:
assuming the result is true for all $l \ge k$, it is proved for $l = k - 1$ by integrating
over the variable $u_k$ the density

$$ f(u_1, \ldots, u_k) = \frac{p^k\, \Gamma\left(\frac{n}{p}\right)}{2^k\, \Gamma^k\left(\frac{1}{p}\right) \Gamma\left(\frac{n-k}{p}\right)} \left( 1 - \sum_{i=1}^{k} |u_i|^p \right)^{\frac{n-k}{p} - 1}. \tag{12} $$
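
Theorems 1 and 2 can be checked by simulation. The sketch below is an illustration under stated assumptions, not the construction used in the cited proof: it takes for $X$ a vector of independent components with density proportional to $\exp(-|x|^p)$ (one convenient p-spherically invariant choice), normalizes it by its p-norm, and compares a Monte Carlo estimate of $E|U_1|^{2p}$ with the value implied by the $k = 1$ marginal, under which $|U_1|^p$ follows a Beta$(1/p, (n-1)/p)$ law. The function names, seed and sample size are hypothetical.

```python
import math
import random

def sample_p_sphere(n, p, rng):
    """Draw one point from the uniform law on the unit p-sphere in R^n (Theorem 1)."""
    # If G ~ Gamma(1/p, 1), then X = +/- G**(1/p) has density proportional to
    # exp(-|x|**p). Independent components give a joint density exp(-||x||_p**p),
    # which depends on x only through its p-norm (p-spherical invariance).
    x = [rng.choice((-1.0, 1.0)) * rng.gammavariate(1.0 / p, 1.0) ** (1.0 / p)
         for _ in range(n)]
    norm = sum(abs(xi) ** p for xi in x) ** (1.0 / p)
    return [xi / norm for xi in x]

def second_moment_theory(n, p):
    """E[|U_1|^(2p)] implied by the k = 1 marginal of Theorem 2."""
    # Under that marginal, |U_1|^p follows a Beta(1/p, (n-1)/p) law.
    a, total = 1.0 / p, n / p
    return a * (a + 1.0) / (total * (total + 1.0))

def second_moment_mc(n, p, draws, seed=0):
    """Monte Carlo estimate of E[|U_1|^(2p)] from normalized samples."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(draws):
        u = sample_p_sphere(n, p, rng)
        acc += abs(u[0]) ** (2 * p)
    return acc / draws

n, p = 5, 3  # illustrative values
print(second_moment_theory(n, p), second_moment_mc(n, p, draws=20000))
```

The two printed numbers should agree up to Monte Carlo error; the same comparison can be repeated for other values of $n$ and $p$.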



3.2 Maximum Tsallis entropy distributions 

The so-called Tsallis information measure $H_q$ (with $q$ a real parameter called
the non-extensivity index) associated with a continuous distribution $f$ is defined
as follows:

$$ H_q = \frac{1}{q-1} \left( 1 - \int f^q(x)\, dx \right). \tag{13} $$

As $q \to 1$, this information measure converges to the classical (Shannon's)
measure of information. We are tacitly assuming that mean values are to
be computed in their customary fashion, linear in the probabilities [15]. Other
ways of expressing Tsallis expectation values do exist, of course [?], but
appealing to them here would unnecessarily complicate things and obscure our
message. The PD with given order-$p$ moment that maximizes the information
measure $H_q$ can be characterized as follows [16].

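Theorem 3 Given $q > 1$, the following problem

$$ f = \arg\max_f \left( 1 - \int f^q(x)\, dx \right) \tag{14} $$

$$ \text{with } E|X_i|^p = K_i, \quad 1 \le i \le k, \tag{15} $$

has the unique solution

$$ f(u_1, \ldots, u_k) = \left( 1 - \sum_{i=1}^{k} \lambda_i |u_i|^p \right)^{\frac{1}{q-1}}. \tag{17} $$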
Since each Lagrange multiplier $\lambda_i$ amounts to stretching the component $U_i$ by a factor
$(\lambda_i)^{1/p}$, we conclude that the probability distributions given by (9) coincide
with the maximum Tsallis information measure [?] distributions with given
order-$p$ moment. Matching the exponents of (9) and (17) gives
$1/(q-1) = (n-k)/p - 1$. Thus, we reach the following conclusion:

Conclusion 4 If $[u_1, \ldots, u_n]$ is uniformly distributed on the p-sphere in $\mathbb{R}^n$,
then all its $k$-variate marginals maximize Tsallis' entropy with a non-extensivity
parameter $q$ given by

$$ q = \frac{n-k}{n-k-p}. \tag{18} $$

We remark moreover that $q > 0$ provided that $(n-k)/p - 1 > 0$ or, equivalently,
$1 \le k < n - p$.

For example, in the case of the 2-norm ($p = 2$), only the marginals of dimensions
$1, 2, \ldots, n-3$ are Tsallis maximizers with a positive parameter $q$. Additionally,
the marginal of dimension $n - p$ is uniform in (not on) the p-sphere in $\mathbb{R}^{n-p}$,
that is, $|u_1|^p + \ldots + |u_{n-p}|^p \le 1$ (not $= 1$), and thus maximizes Tsallis' entropy, but
with $q = +\infty$. Note that the remaining "large dimension" marginals, i.e., the
ones for which $n - p + 1 \le k \le n - 1$, maximize Tsallis entropy with a parameter
$q < 0$.

Summing up: as the dimension of the marginal decreases, we go from maximizers
of Tsallis entropies with

• (A) $q < 0$ if $n - p + 1 \le k \le n - 1$

• (B) $q = +\infty$ if $k = n - p$

• (C) $q > 1$ if $k \le n - p - 1$

• (D) $q \simeq 1$ for $n \to \infty$.

For macroscopic systems, item (D) applies (classical statistics). Consider then
the case of finite $n$: in most cases of physical relevance $k$ is small, so that item
(C) applies. Item (A) corresponds to a situation in which we have a great deal
of information, which specifies the more important aspects of the problem; only
small details remain to be determined. A distribution with $q < 0$ amplifies
precisely those small details [?]. Item (B) corresponds to a very peculiar situation,
the uniform distribution, as discussed below.
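
The classification above can be tabulated directly from (18). The following sketch assumes integer $n$, $k$ and $p$ and uses exact rational arithmetic; the helper names `q_index` and `regime` are illustrative only. Item (D) is a limit rather than a separate case, so it is not returned by `regime`.

```python
from fractions import Fraction

def q_index(n, k, p):
    """Non-extensivity index of Eq. (18), or None when k = n - p (q = +infinity)."""
    if n - k - p == 0:
        return None
    return Fraction(n - k, n - k - p)

def regime(n, k, p):
    """Label (A), (B) or (C) of the summary list, for integers 1 <= k <= n - 1."""
    if k > n - p:
        return "A"  # q < 0
    if k == n - p:
        return "B"  # q = +infinity, uniform law inside the p-sphere
    return "C"      # q > 1

n, p = 10, 2  # illustrative values
for k in range(1, n):
    print(k, regime(n, k, p), q_index(n, k, p))
```

For fixed $k$ and $p$, `q_index` approaches 1 as $n$ grows, which is item (D).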



3.3 Application 

Suppose we observe a $k$-variate random vector $Y$ distributed according to a
Tsallis distribution with associated parameter $q > 1$, assumed known (in fact it
can be estimated easily). The idea is that $Y$ can be interpreted as a restricted
set of components of a larger system with $n > k$ degrees of freedom, this larger
system being distributed according to a more natural distribution, namely the
uniform distribution on the p-sphere in $\mathbb{R}^n$. In such a case, $n$ and $p$ are related
to $q$ and $k$ as prescribed in the preceding Section, namely,

$$ \frac{1}{q-1} = \frac{n-k}{p} - 1 \;\Longrightarrow\; q = \frac{n-k}{n-k-p}. $$

This supposes that the $n - k$ remaining variables are hidden or unavailable at
the time of the measurement.
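
Read in the other direction, the relation above gives the total number of degrees of freedom, $n = k + pq/(q-1)$, once $q$, $k$ and $p$ are fixed. The sketch below, with a hypothetical helper name and made-up input values, performs this inversion with exact fractions; the result is physically meaningful only when it is an integer larger than $k$.

```python
from fractions import Fraction

def hidden_dimension(q, k, p):
    """Solve 1/(q - 1) = (n - k)/p - 1 for n, given q > 1."""
    q = Fraction(q)
    if q <= 1:
        raise ValueError("the relation is used here only for q > 1")
    return k + p * q / (q - 1)

# Hypothetical observation: a 1-variate Tsallis law with q = 9/7 and p = 2.
q_observed = Fraction(9, 7)
n = hidden_dimension(q_observed, k=1, p=2)
print("n =", n)
# Round trip through Eq. (18) with k = 1 and p = 2.
print("q recovered from (18):", (n - 1) / (n - 1 - 2))
```

The closer the observed $q$ is to 1, the larger the inferred $n$, in line with the macroscopic limit discussed next.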

Strictly speaking, we recover classical statistics ($q = 1$) only for $n \to \infty$.
Otherwise, since $k$ is assumed to be small, $q > 1$ and we are within the non-extensive
realm. This $q > 1$ restriction agrees with considerations recently made
from an entirely unconnected viewpoint that employs escort distributions and
Fisher's information measure [17]. For macroscopic systems, however, $n$ is of
the order of Avogadro's number, and thus $q$ is very close to unity.

In standard statistical mechanics textbooks (see Section 2.1) we have $p = 2$
and $n$ the number of particles, with $n \sim 10^{24}$. Thus, the classicality criterion
$q = 1$ works quite well indeed. What was called the first particle in Section 2.1 is
assumed to be a test particle, representative of the remaining degrees of freedom,
so that the idea that $Y$ can be regarded as a restricted set of components of a
larger system with $n > k$ degrees of freedom can be safely ignored.

Strictly speaking though, this larger system is distributed according to a
more natural distribution than the Boltzmann or the Tsallis ones, namely the
uniform distribution on the p-sphere in $\mathbb{R}^n$.

4 A quantum analogy 

In order to get a better grasp of the changes in the values of $q$ described at
the end of Section 3, we now resort to the following analogy: consider
a physical situation that revolves around a system that can exist in any of a
large (discrete) set of states of the same energy $E_a$, labelled by a quantum number
$i$, $i = 1, \ldots, n$ [18], and suppose that our interest lies in the probability distribution
(PD) $p_i$. In quantum mechanics, these states constitute a basis that spans an
$n$-dimensional linear vector subspace. All possible physical states of our system
that have energy $E_a$ can be expressed as linear combinations of these basis states
with complex coefficients, so that we are speaking here of a subset of $\mathbb{C}^n$, which
does not seriously affect our considerations. Indeed, it has been recently
pointed out by Caves, Fuchs, and Rungta [19] that real quantum mechanics (that
is, quantum mechanics defined over real vector spaces [20, 21, 22, 23]) provides
an interesting foil theory whose study may shed some light on which aspects of
quantum entanglement are unique to standard quantum theory, and which ones
are more generic over other physical theories endowed with the phenomenon of
entanglement.

We assume further that i) we only have access to, say, $k < n$ of these states and
ii) the PD $p_j$ (of finding the system in the basis state $j$) is uniform both for
$k = n$ and for $k = n - p$.
Now, it is well known that i) the entropy $S$ is a functional of the probability
distribution that quantifies the degree of ignorance for a given scenario [3] and
ii) the uniform distribution always yields the largest possible entropic value [24].
In the present circumstances we thus have, for Tsallis' entropy $S_q$ [10],

$$ S_q^{(k=n)} = k_B \frac{n^{1-q} - 1}{1-q} > S_q^{(k<n)}, \tag{19} $$

so that

$$ D_q = S_q^{(k=n)} - S_q^{(k<n)} = \frac{k_B}{1-q} \left( n^{1-q} - \sum_{j=1}^{k} p_j^q \right). \tag{20} $$

$D_q$ quantifies the information gain (or ignorance loss) that, paradoxically, ensues
from the fact that one does not have access to $n - k$ states. For $q = 1$, $D_1$ is
actually the Kullback-Leibler cross entropy.

Some refinement is still needed with reference to the above considerations. We
have seen in Section 3 that a uniform distribution also ensues for $k = n - p$,

$$ S_q^{(k=n-p)} = k_B \frac{(n-p)^{1-q} - 1}{1-q} < S_q^{(k=n)}. \tag{21} $$

It is clear that, according to (19), the entropy of a uniform distribution (all the
pertinent $p_j$'s are equal) grows with the corresponding number of states.
Notice that the above considerations only make sense within the non-extensive
framework. For $q = 1$ one has $n = \infty$.

Consider now the particular situation $k = n - p + 1$. Clearly, then, the pertinent
entropy has to be larger than the uniform one for $k = n - p$:

$$ S_q^{(k=n-p+1)} = k_B \frac{(n-p+1)^{1-q} - 1}{1-q} > S_q^{(k=n-p)} \;\Longrightarrow\; (n-p+1)^{1-q} > (n-p)^{1-q}, \tag{22} $$

which implies

$$ q < 0, \tag{23} $$

and we now understand point (A) at the end of Section 3. Conversely, take now
$k = n - p - 1$. A similar line of reasoning yields

$$ S_q^{(k=n-p-1)} = k_B \frac{(n-p-1)^{1-q} - 1}{1-q} < S_q^{(k=n-p)} \;\Longrightarrow\; (n-p-1)^{1-q} < (n-p)^{1-q}, \tag{24} $$

which clearly implies

$$ q > 1, \tag{25} $$

and we thus understand point (C) at the end of Section 3.

5 Conclusions 

Boltzmann-Gibbs' and Tsallis' probability distributions can be derived via purely
geometric arguments starting from a uniform distribution. In particular, we have
shown that such geometric considerations can be employed in order to determine
the non-extensivity index $q$. How to fix $q$ remains an open
problem for non-extensive thermostatistics, which gives our result additional
interest.